A Survey of Bitmap Index-Compression Algorithms for Big Data
Authors
Abstract
With the growing popularity of Internet applications and the widespread use of mobile Internet, Internet traffic has grown rapidly over the past two decades. Internet Traffic Archival Systems (ITAS) for packets or flow records have become increasingly common in network monitoring, network troubleshooting, and the analysis of user behavior and experience. In this paper, we survey bitmap-index compression algorithms for traffic archival systems. The current state-of-the-art bitmap-index encoding schemes include BBC, WAH, PLWAH, EWAH, PWAH, CONCISE, and COMPAX. Based on differences in segmentation, chunking, merge compress, and Near Identical (NI) features, we provide a thorough categorization of these algorithms. We also propose several new bitmap encoding algorithms (SECOMPAX, ICX, MASC, and PLWAH+) and show the state diagrams for their encoding algorithms. We then evaluate their CPU and GPU implementations with a real Internet trace from CAIDA. Finally, we summarize and discuss future directions for bitmap-index compression algorithms.
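As a rough illustration of the word-aligned run-length idea behind WAH, one of the schemes listed in the abstract, the following minimal Python sketch may help. It assumes the commonly described 32-bit WAH layout: the bitmap is cut into 31-bit groups, a literal word has its top bit cleared and carries one group, and a fill word has its top bit set, the next bit giving the fill value, and the low 30 bits counting consecutive identical groups. The function names `wah_encode` and `wah_decode` are hypothetical, the final group is simply padded with zeros, and this is not the implementation evaluated in the survey.

```python
GROUP_BITS = 31                      # payload bits per 32-bit word
ALL_ONES = (1 << GROUP_BITS) - 1     # a group made only of 1 bits
FILL_FLAG = 1 << 31                  # top bit set: the word is a fill word
FILL_BIT = 1 << 30                   # value (0 or 1) repeated by a fill word
MAX_RUN = (1 << 30) - 1              # largest group count a fill word can hold


def wah_encode(bits):
    """Encode a list of 0/1 values into 32-bit WAH-style words.

    Simplification (assumption): the last group is padded with zeros,
    so the caller must remember the original bit length.
    """
    words = []
    for start in range(0, len(bits), GROUP_BITS):
        chunk = bits[start:start + GROUP_BITS]
        chunk = chunk + [0] * (GROUP_BITS - len(chunk))
        group = 0
        for b in chunk:
            group = (group << 1) | b
        if group in (0, ALL_ONES):
            fill_bit = FILL_BIT if group else 0
            last = words[-1] if words else 0
            same_fill = (last & FILL_FLAG) and (last & FILL_BIT) == fill_bit
            if same_fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] = last + 1          # extend the current run
            else:
                words.append(FILL_FLAG | fill_bit | 1)
        else:
            words.append(group)               # literal word, top bit is 0
    return words


def wah_decode(words):
    """Expand WAH-style words back into a list of 0/1 values."""
    bits = []
    for w in words:
        if w & FILL_FLAG:
            value = 1 if w & FILL_BIT else 0
            bits.extend([value] * (GROUP_BITS * (w & MAX_RUN)))
        else:
            bits.extend((w >> s) & 1 for s in range(GROUP_BITS - 1, -1, -1))
    return bits


# Usage: one mixed group, two all-zero groups, one all-one group.
example = [1] + [0] * 92 + [1] * 31
encoded = wah_encode(example)
decoded = wah_decode(encoded)
```

In this example the four 31-bit groups collapse into three words: one literal, one zero-fill covering two groups, and one one-fill covering a single group. Variants such as PLWAH or CONCISE differ in how they pack extra information into fill words, which this sketch does not attempt to model.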
Related resources
Improving Bitmap Index Compression by Data Reorganization
The volume of data generated by scientific applications through observations or computer simulations can reach the order of petabytes. This creates a need for effective and compact indexing methods for efficient storage and retrieval of scientific data. Bitmap indexing has been successfully applied in this domain by exploiting the fact that scientific data are mostly read-only and en...
Bitmap Indices for Speeding Up High-Dimensional Data Analysis
Bitmap indices have gained wide acceptance in data warehouse applications and are an efficient access method for querying large amounts of read-only data. The main trend in bitmap index research focuses on typical business applications based on discrete attribute values. However, scientific data that is mostly characterised by non-discrete attributes cannot be queried efficiently by currently s...
Data Compression for Bitmap Indexes
Compression Ratio (CR) and Logical Operation Time (LOT) are two major measures of the efficiency of bitmap indexing. Previous works by [5, 9, 10, 11] compare the performance of bitmap compression schemes conducted separately on logical operation time and compression ratio. This paper will describe these works and recommend for consideration a new matrix – overall efficiency indicator. The overa...
Bitmap Indices for Data Warehouses
In this chapter we discuss various bitmap index technologies for efficient query processing in data warehousing applications. We review the existing literature and organize the technology into three categories, namely bitmap encoding, compression and binning. We introduce an efficient bitmap compression algorithm and examine the space and time complexity of the compressed bitmap index on large ...
Genetic Algorithms and Cellular Automata: unraveling the Bitmap Problem
Using Genetic Algorithms to evolve Cellular Automata rules to solve a given problem is a well-known method. The Bitmap Problem, however, with its versatile and challenging characteristics, remains quite unknown. This thesis focuses on the Bitmap Problem and tries to expose its inner workings, possibilities and pitfalls. Multiple aspects of the Bitmap Problem like grid size, state set and updati...
Publication date: 2014